The sequences of the peplomeric S1 protein of four serologically distinct strains of the infectious bronchitis virus
(IBV), an avian coronavirus, have been determined. The S1 protein is thought to contain the serotype-specific
neutralization epitopes and to be the main target of antigenic variation. An alignment with sequences of three strains
published previously showed that of the 545 amino acid residues only 243 have been conserved. Clustering of substitutions
suggests that most serotype determinants are located within the first 300 amino acid residues of S1. A phylogenetic
tree of the S1 sequences showed very variable rates of divergence. Differences in topology with a tree based on
RNase-T1 fingerprint data indicate that some of the IBV strains have arisen by genetic recombination.


Avian infectious bronchitis (IB) is a worldwide disease,
which is caused by a coronavirus and results in
a highly contagious respiratory affliction of young 
chickens or in a decrease in egg production (35). Ini- 
tially, IB was effectively controlled by vaccination with 
live attenuated IB virus (IBV). However, this did not pre- 
vent outbreaks caused by variant viruses (70, 73). 
RNase T1 fingerprinting analysis showed that field 
strains isolated from such outbreaks are related to vac- 
cine strains (24), suggesting that the new strains have 
originated from vaccine virus by antigenic variation. As 
in other RNA viruses, antigenic variation is probably
facilitated by the relatively high error rate (10⁻⁴)
during the transcription of the RNA template and
the absence of a proofreading mechanism (78, 33).

It is generally assumed that the serotype of IBV is 
determined by the glycoprotein E2, which is the struc- 
tural component of the peplomers, the typical club- 
shaped structures projecting from the surface of the 
virus. E2 is processed proteolytically to two noncovalently
bound peptide chains, S1 and S2 (6, 34). S2 contains
the C-terminal half of the sequence, including the
transmembrane anchor and two long α-helices that
form the stalk of the peplomer (74). S1 forms the top 
part of the peplomer and presumably carries the sero- 
typical determinants. This assumption is based on the 
findings that all strongly neutralizing monoclonal antibodies
recognize S1, and that immunization with purified S1,
but not with virus lacking S1, induces neutralizing
antibodies (5, 7, 29).

To investigate the serotypic variation of IBV at the 
molecular level, we determined the S1 sequences of 
four IBV strains (73, 24): H120, an attenuated vaccine 
strain of serotype A; D207, the reference strain of sero- 
type B; D1466, a vaccine strain of serotype C; and 
V1397, a recent Dutch isolate from serotype A/C. 
These sequences were compared with the sequences
of three strains published previously (3, 4, 37): M41, a
pathogenic strain from serotype A used in a killed-virus 
vaccine; M42, a nonpathogenic laboratory strain of se- 
rotype A; and 6/82, a recent British field isolate of sero- 
type B (70). 

Virus strains were obtained from the Poultry Health 
Institute, Doorn, The Netherlands. Details on the isola- 
tion and passage history have been described (24). 
Strains were passaged once in the allantoic cavity of 
10-day-old chicken embryos. Virus stocks were stored 
at -70°. Virus growth, isolation of genomic RNA, cDNA
synthesis, cloning, and sequencing were carried out 
essentially as described previously (37). With V1397, 
cDNA synthesis was primed using random calf thymus 
pentanucleotides (Pharmacia). By screening of colo- 
nies with probes containing S1 sequences of the M41 
strain, S1 clones of D207 and H120 were detected. 
Partial sequencing of a D1466 clone with a large insert 
yielded S1 sequences that could be used as a probe to 
obtain D1466 as well as V1397 clones. 

Most sequences were based on two or more independent
cDNA clones. Only with strain D207 were three nucleotide
differences between cDNA clones observed, two of them
leading to amino acid substitutions. During the sequence
determination of strain M41, one difference between two
cDNA clones was observed. From the nucleotide sequences
(submitted to the EMBL, GenBank, and DDBJ nucleotide
sequence databases), amino acid sequences have been
deduced. In Fig. 1, the amino acid sequences are listed
together with the S1 sequences published previously.
As observed earlier (37), sequence differences at a
number of positions show that different laboratory
variants of a strain can exist; our H120 sequence has
five differences with M41 not reported by Cavanagh
et al. (9).

M41U,H   MLVTPLLLVTLLCVLCSAALYDSSSYVYYYQSAFRPPNGWHLHGGAYAVVNISSESNNAG   60
M41U,H   SSPGCIVGTIHGGRVVNASSLAMTAPSSGMAWSSSQFCTAHCNFSDTTVFVTHCYKYD-G  120
M41U     -CPITGMLQKNFLRVSAMKNGQLF--YNLTVSVAKYPTFK-SF-QCVNNLTSVYLNGDLV  180
M41U     YTSNETTDVTSAGVYFKAGGPITYKVMRKVKALAYFVNGTAQDVILCDGSPRGLLACQYN  240
M41U     TGNFSDGFYPFINSSLVKQKFIVYRENSVNITFTLHNFTFHNETGANPNPSGVQNILTYQ  300
M41U,H   TQTAQSGYYNFNFSFLSSFVYKESNFMYGSYHPSCNFRLETINNGLWFNSLSVSIAYGPL  360

[The difference rows for M41H, M42S, M42H, H120, D207, D207*, 6/82, D1466, and V1397, and positions 361-545 of the
alignment, are not legible in this copy.]

Fig. 1. Amino acid sequences of IBV S1 proteins. The sequence from M41 variant M41U has been listed completely; from other strains only
the differences with this sequence are shown. The sequences from M41, its cDNA variant M41U* (Ala on position 398, unpublished) and from
the M42 variant M42S are from Niesters et al. (37). The sequence of M42 variant M42H is from Binns et al. (3). The sequences of M41 variant
M41H and of strain 6/82 are from Binns et al. (4). D207* represents a cDNA variant of strain D207 (Glu-48 as main variant was found by
sequencing independent clones, Lys-117 by direct RNA sequencing). Dashes were introduced to align the sequences. Potential glycosylation
sites (NXS or NXT, X ≠ P) are underlined. Conserved cysteine residues are in boldface.
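The potential glycosylation sites marked in Fig. 1 follow the rule NXS or NXT with X not proline. As an illustration only, the Python sketch below scans a sequence for that motif; the function name `find_sequons` is hypothetical and not part of the original work, and the test string is the first 60 residues of the M41U row of Fig. 1, which contain no alignment gaps.

```python
def find_sequons(seq):
    """Return 1-based positions of N in N-X-S or N-X-T motifs where X is not P.

    Assumes `seq` is a plain amino acid string without alignment gaps.
    """
    hits = []
    for i in range(len(seq) - 2):
        # N at position i, any residue except proline next, then S or T
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in ("S", "T"):
            hits.append(i + 1)
    return hits


# Residues 1-60 of M41U as listed in Fig. 1
m41u_1_60 = "MLVTPLLLVTLLCVLCSAALYDSSSYVYYYQSAFRPPNGWHLHGGAYAVVNISSESNNAG"
print(find_sequons(m41u_1_60))
```

Overlapping motifs are reported individually because every asparagine is tested on its own. For gapped rows of an alignment, the dashes would have to be removed first and the positions mapped back to the alignment numbering.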

The length of the S1 protein varies between 535 and 
538 amino acids, including the signal peptide and the 
arginine-rich cleavage site between S1 and S2 (6). The
sequences could be aligned by assuming deletions/in- 
sertions at 14 positions. Two S1 proteins can have 
different amino acids in up to 49% of the positions of 
the sequence (Table 1). However, of the 17 to 19
cysteine residues, 16 have been completely conserved,
as are most of the glycosylation sites. Presumably,
the sequence variability is the combined result of
the accumulation of neutral substitutions and the
positive selection of antigenic variants; the relatively
high frequency of nonsilent mutations in several parts
of the S1 sequence (not shown) suggests positive selection.

To localize the most variable regions, the number of
different amino acid residues found at each of the 545
positions is plotted in Fig. 2. At 243 positions, the
amino acid residues are conserved in all strains.
Although there are no clearly defined hypervariable
regions such as those in the HIV envelope protein (30),
the VP1 protein of foot-and-mouth disease virus (2, 77),
and the rotavirus VP7 protein (76, 77), there are
relatively many replacements in the regions 50-170 and
250-310. Insertions/deletions were mainly found in the
region 120-170. These observations suggest a localization
of most of the serotypic and antigenic determinants in
the N-terminal half of the S1 subunit.


TABLE 1


DIFFERENCE MATRIX OF S1 SEQUENCES OF IBV STRAINS*


              M41U   M42S   H120   D207   6/82   D1466  V1397

M41U (A)       -      ?      ?     22.6   22.4   44.4   44.4
M42S (A)       ?      -      ?     21.6   21.6   45.0   45.0
H120 (A)       ?      ?      -     21.5   21.5   44.0   44.0
D207 (B)      22.6   21.6   21.5    -      1.1   48.8   48.0
6/82 (B)      22.4   21.6   21.5    1.1    -     48.4   47.8
D1466 (C)     44.4   45.0   44.0   48.8   48.4    -      5.8
V1397 (A/C)   44.4   45.0   44.0   48.0   47.8    5.8    -


*The figures represent the percentages of nonidentical amino
acids. The designations of the strains are as in Fig. 1. From each
strain, only one variant has been listed.
?, value not legible in this copy.


A more accurate definition of the antigenic determinants
may be derived from a comparison of similar sequences
of strains with different serological properties.
It has been suggested (9, 37) that in serotype-A strains
the clustered substitutions in two regions, HVR 1 (56-69
in the numbering of Fig. 1) and HVR 2 (117-133),
coincide with neutralization epitopes. Indeed, a mutation
in HVR 1 prevented neutralization by two different
monoclonal antibodies (9). Of the six differences between
the serotype-B strains D207 and 6/82, three
are in a region corresponding to HVR 2.


Fig. 2. Number of different amino acid residues per position in the sequences listed in Fig. 1.
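The quantity plotted in Fig. 2, the number of different residues per alignment position, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used for the figure: the function names and the toy alignment are hypothetical, and a gap character is counted here as a distinct state, which the text does not specify.

```python
def residues_per_position(aligned):
    """Count distinct symbols in each column of equal-length aligned sequences.

    Assumption: a gap ('-') counts as its own state.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [len(set(column)) for column in zip(*aligned)]


def conserved_count(aligned):
    """Number of columns in which all sequences carry the same symbol."""
    return sum(1 for n in residues_per_position(aligned) if n == 1)


# Hypothetical toy alignment, not taken from Fig. 1
toy = ["MLVTP", "MLATP", "MQ-TP"]
print(residues_per_position(toy))
print(conserved_count(toy))
```

Applied to the full seven-strain alignment, `conserved_count` would correspond to the count of positions conserved in all strains, provided gaps are treated the same way as in the original analysis.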

The epitopes of IBV recognized by neutralizing 
monoclonal antibodies against S1 are conformation 
dependent (unpublished data). As described elsewhere
(25; Kusters et al., unpublished results), the 30
N-terminal residues of the S2 subunit contain several
overlapping conformation-independent epitopes that
evoke a weak neutralizing response. Since these epitopes
cross-react with antisera against different serotypes,
they are not relevant for the serotype of the virus.

An alignment of the S1 nucleotide sequences was 
used to calculate a distance matrix, from which the 
most likely phylogenetic tree was inferred by a program 
distributed as part of the PHYLIP package 2.6 (75). The 
topology and branch lengths of this tree, shown in Fig. 
3A, were not affected by shuffling the order of se- 
quences. Intriguingly, there are differences in topology 
between this tree and a tree based on RNase-T1
fingerprints (Fig. 3B). First, M41, M42, and H120 are
placed apart in the T1 tree (<95% identity) but have
closely related S1 sequences (≥97.5% nucleotide
identity). Second, the considerable divergence of the
S1 sequences of strains D1466 and D207 is not
reflected in the T1 tree, nor in a tree (not shown) based
on sequences of the E1 genes (8). Third, V1397 is
related to H120 in the T1 tree (=99.5% overall sequence
similarity), but to D1466 in the S1 tree.


Fig. 3. Phylogenetic relationship between IBV strains. (A) Tree of S1 sequences. The percentage of nucleotide relatedness was calculated
for the strains listed in Table 1. The figures indicate percentages of replaced nucleotides. The tree is unrooted, i.e., the position of the
hypothetical ancestral IBV sequence is unknown. (B) Tree representing genomic relatedness. Percentages are estimated from RNase-T1 fingerprint
analysis (7). The serotype is indicated between parentheses. The data are from Kusters et al. (24), except for M42 (72; J.G.K., unpublished).
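Both Table 1 and the distance matrix underlying Fig. 3A rest on pairwise comparison of aligned sequences. The Python sketch below shows one way such a matrix of percentage differences could be computed. The function names and toy sequences are hypothetical; the gap handling (a gap against a residue counts as a difference, columns with gaps in both sequences are skipped) is an assumption, since the text does not state how insertions/deletions were scored; and the tree inference carried out with the PHYLIP package is not reproduced here.

```python
def percent_difference(a, b):
    """Percentage of nonidentical positions between two aligned sequences.

    Assumptions: columns where both sequences have a gap are ignored;
    a gap opposite a residue is counted as a difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not compared:
        raise ValueError("no comparable positions")
    differences = sum(1 for x, y in compared if x != y)
    return 100.0 * differences / len(compared)


def difference_matrix(seqs):
    """Symmetric matrix of percent differences as a dict of dicts."""
    names = list(seqs)
    return {
        p: {q: percent_difference(seqs[p], seqs[q]) for q in names}
        for p in names
    }


# Hypothetical toy nucleotide alignment
toy_seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACCTAC-A"}
matrix = difference_matrix(toy_seqs)
for name, row in matrix.items():
    print(name, [round(row[other], 1) for other in toy_seqs])
```

A matrix of this form is the usual input to distance-based tree programs; which of those programs and which distance correction were applied to the S1 sequences is not stated in the text.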

Theoretically, the first two of these discrepancies 
might be resolved by assuming extreme variations of 
the evolutionary rate within the viral genome. However, 
this would not explain the similar S1 sequences of 
V1397 and D1466 vs the common RNase-T1 spots of 
V1397 and H120. Instead, we propose that genetic re- 
combination has played a role in the generation of anti- 
genic variants. For instance, V1397 may have acquired 
a D1466-like peplomer gene. In the murine coronavirus 
MHV, recombination occurs at a rather high frequency 
(19, 20, 26) and may alter the serotype of the virus (27). 
For IBV, conditions that favor recombination are cre- 
ated in the field by vaccination of chickens with more 
than one attenuated IBV strain. Thus, infections of cells 
with two different strains, leading to the formation of 
recombinants, may very well have occurred. 

RNA recombination has been well documented for 
picornaviruses (27-23, 28, 32). It would be interesting 
to test our hypothesis that recombination also plays a 
role in the generation of new IBV variants. Such a test 
might be based on the localization of the recombination
site or on in vitro recombination experiments.
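One conceivable way to approach the localization of a recombination site is to compare a putative recombinant with two candidate parental sequences window by window and look for the region where the closer parent changes. The Python sketch below illustrates that idea only; it is not a method described in this paper, and the function name, window settings, and toy sequences are all hypothetical.

```python
def window_identity(query, reference, window, step):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (1-based window start, percent identity) tuples.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    profile = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        same = sum(1 for x, y in zip(q, r) if x == y)
        profile.append((start + 1, 100.0 * same / window))
    return profile


# Hypothetical toy data: the query matches parent_a in its first half
# and parent_b in its second half
parent_a = "AAAAAAAAAAAA"
parent_b = "CCCCCCCCCCCC"
query = "AAAAAACCCCCC"
print(window_identity(query, parent_a, 4, 4))
print(window_identity(query, parent_b, 4, 4))
```

In the toy data the two profiles cross in the middle window, which is where the switch between parents lies. With real sequences the signal would be noisier and would need statistical support before a crossover could be claimed.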